Learning to Surface Deep Web Content
Authors
Abstract
We propose a novel deep web crawling framework based on reinforcement learning. The crawler is regarded as an agent and the deep web database as the environment. The agent perceives its current state and submits a selected action (a query) to the environment according to its Q-value. Based on this framework, we develop an adaptive crawling method. Experimental results show that it outperforms state-of-the-art methods in crawling capability and removes the assumption of full-text search implied by existing methods.
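The agent, environment, and Q-value loop described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' method: the toy `RECORDS` database, the reward (number of newly retrieved records), the tabular update rule, the optimistic `initial_q`, and all function names are assumptions made for illustration only.

```python
import random

# Hypothetical toy "deep web database": each record is a set of keywords.
RECORDS = [
    {"python", "crawler"},
    {"python", "database"},
    {"deep", "web", "crawler"},
    {"web", "database"},
    {"learning", "python"},
]


def submit_query(keyword):
    """Environment step: return the indices of records matching the keyword."""
    return {i for i, rec in enumerate(RECORDS) if keyword in rec}


def crawl(candidates, steps, epsilon=0.1, alpha=0.5, initial_q=5.0, seed=0):
    """Epsilon-greedy query selection driven by a per-query Q-value table.

    Assumed design: the state is the set of records harvested so far, an
    action is a candidate keyword, and the reward is the number of new records.
    """
    rng = random.Random(seed)
    # Optimistic initial values push the agent to try each query at least once.
    q = {kw: initial_q for kw in candidates}
    harvested = set()
    for _ in range(steps):
        if rng.random() < epsilon:
            action = rng.choice(sorted(q))        # explore a random query
        else:
            action = max(sorted(q), key=q.get)    # exploit the highest Q-value
        new = submit_query(action) - harvested
        reward = len(new)
        harvested |= new
        # Move the estimate for this query toward the observed reward.
        q[action] += alpha * (reward - q[action])
    return harvested, q


if __name__ == "__main__":
    found, q_values = crawl(["python", "web", "crawler"], steps=3, epsilon=0.0)
    print(sorted(found), q_values)
```

In this sketch a query that stops returning new records sees its Q-value decay, so the agent shifts to other candidates. A realistic crawler would need to estimate Q-values for queries it has not yet issued, which this table-based toy does not attempt.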
Similar resources
A Method for Extracting Information from the Web Using Deep Learning Algorithm
Research on web mining is becoming more important because large amounts of data are now managed through the internet. Web usage is increasing in an uncontrolled manner, and a dedicated system is needed to manage such large amounts of data in the web space. Web mining is classified into three major divisions: web content mining, web usage mining and web ...
Effective Learning to Rank Persian Web Content
The Persian language is one of the most widely used languages on the Web. Hence, the Persian Web includes invaluable information that needs to be retrieved effectively. As with other languages, ranking algorithms for Persian Web content face several challenges, such as applicability issues in real-world situations and the lack of user modeling. CF-Rank, as a ...
The Relationship of Study and Learning Approaches with Students’ Academic Achievement in Rafsanjan University of Medical Sciences
Introduction: Most experts consider the learning approach the fundamental basis of learning and divide it into two types: the deep learning approach and the surface learning approach. This study investigates the relationship between learning and study approaches and academic achievement among students at Rafsanjan University of Medical Sciences. Methods: This descriptive cross-sectional stu...
Analyzing new features of infected web content in detection of malicious web pages
Recent improvements in web standards and technologies enable attackers to hide and obfuscate malicious code with new methods and thus evade security filters. In this paper, we study the application of machine learning techniques to detecting malicious web pages. To detect malicious web pages, we propose and analyze a novel set of features including HTML, JavaScript (jQuery...
The Effect of Teaching Self-Regulation Strategies on the Learning Approaches of First-Grade High School Students
Abstract: The present study was conducted to determine the effect of learning self-regulation strategies on the surface, deep and strategic learning approaches of first-grade female high school students in Yazd. The study used a pre-test and post-test design. For this purpose, a sample of 57 subjects was selected by multistage cluster sampling from among first-grade female high school ...